Porting an European Portuguese Broadcast News Recognition System to Brazilian Portuguese

Authors

  • Alberto Abad
  • Isabel Trancoso
  • Nelson Neto
  • Céu Viana
Abstract

This paper reports on recent work within the PoSTPort project, which aims at porting a Broadcast News recognition system originally developed for European Portuguese to other varieties of the language. Here we focus on porting to Brazilian Portuguese. We assess the impact of some of the main sources of variability and propose solutions at the lexical, acoustic and syntactic levels. The ported Brazilian Portuguese Broadcast News system reduced the word error rate (WER) drastically, from 56.6% (obtained with the European Portuguese system) to 25.5%.
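For readers unfamiliar with the metric, the sketch below shows how a word error rate is commonly computed (word-level edit distance divided by the number of reference words) and what the two figures quoted in the abstract imply as a relative reduction. It assumes the standard WER definition; the scoring tool actually used by the authors is not described here, the function names are hypothetical, and the toy sentence pair is invented purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: plain whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# Invented toy pair: one reference word is dropped, so WER = 1/4.
toy_wer = word_error_rate("o presidente chegou hoje", "o presidente chegou")

# Figures quoted in the abstract: 56.6% WER before porting, 25.5% after.
reduction = relative_reduction(56.6, 25.5)
print(f"toy WER: {toy_wer:.2f}, relative WER reduction: {reduction:.1%}")
```

Under these assumptions, the drop from 56.6% to 25.5% corresponds to a relative WER reduction of roughly 55%.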

Similar resources

The L2F Broadcast News Speech Recognition System

Broadcast news plays an important role in our lives, providing access to news, information and entertainment. Automatic transcription is an important tool, not only because it can provide subtitles for people with special needs or help in noisy and crowded environments, but also because it enables data search and retrieval capabilities over the multimedia s...

Aligning and recognizing spoken books in different varieties of Portuguese

This paper presents digital spoken books as a useful diagnostic tool for detecting alignment and recognition problems and for studying the porting of these technologies to different varieties of the same language (Portuguese, in our case). We summarize the main differences between European and Brazilian Portuguese (EP/BP) and describe how they affect the GtoP system. Despite the small siz...

Language and variety verification on broadcast news for Portuguese

This paper describes a language/accent verification system for Portuguese that explores different types of properties: acoustic, phonotactic and prosodic. The two-stage system is designed to be used as a pre-processing module for the Portuguese Automatic Speech Recognition (ASR) system developed at INESC-ID. As the ASR system is applied every day to transcribe the evening news from a Portuguese ...

Digital Talking Books in Multiple Languages and Varieties

This paper describes our work in digital talking book alignment, starting with our earlier efforts on the alignment of books in European Portuguese and ending with the two challenges we currently face: aligning books in different varieties of Portuguese and aligning parallel books in different languages. Our alignment module proved robust enough for porting to other varieties of Portugu...

Statistical Machine Translation of Broadcast News from Spanish to Portuguese

In this paper we describe the work carried out to develop an automatic system for translation of broadcast news from Spanish to Portuguese. Two challenging topics of speech and language processing were involved: Automatic Speech Recognition (ASR) of the Spanish News and Statistical Machine Translation (SMT) of the results to the Portuguese language. ASR of broadcast news is based on the AUDIMUS...

Publication date: 2009